CS231n Study Notes: 8. Deep Learning Software

1. CPU & GPU

GPUs are better suited to simple tasks that parallelize well, such as matrix operations.

Open-source GPU libraries:

CPU vs GPU:

CPU/GPU communication:

2. Deep Learning Frameworks

Building a neural network by hand with NumPy:
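A minimal sketch of such a hand-built network, assuming a two-layer fully connected net with a ReLU and a squared-error loss. The sizes, seed, initialization scale, and learning rate are illustrative choices, not values taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, H = 64, 100, 50  # batch size, input/output dim, hidden dim (illustrative)
x = rng.standard_normal((N, D))
y = rng.standard_normal((N, D))
w1 = rng.standard_normal((D, H)) * 0.1
w2 = rng.standard_normal((H, D)) * 0.1

learning_rate = 1e-4
losses = []
for step in range(100):
    # Forward pass: linear -> ReLU -> linear, then summed squared error
    h = np.maximum(x @ w1, 0)
    y_pred = h @ w2
    loss = np.square(y_pred - y).sum()
    losses.append(loss)

    # Backward pass: apply the chain rule by hand, layer by layer
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h.T @ grad_y_pred
    grad_h = grad_y_pred @ w2.T
    grad_h[h <= 0] = 0  # ReLU passes gradient only where it was active
    grad_w1 = x.T @ grad_h

    # Plain gradient descent update
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
```

Everything here runs on the CPU, and every gradient has to be derived manually, which is what the frameworks in the following sections take over.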

Building a neural network with TensorFlow:

Building a neural network with PyTorch:

3. TensorFlow

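Building a neural network in TensorFlow with the weights held in placeholders tends to run into the following problem: the weights are fed into the graph, and copied back out, on every call.

Solution: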

Change w1 and w2 from placeholder (fed on each call) to Variable (persists in the graph between calls)
Add assign operations to update w1 and w2 as part of the graph
Run graph once to initialize w1 and w2
Add dummy graph node that depends on updates
Tell graph to compute dummy node

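Tips:
In essence, the weights stay in memory inside the graph as Variables, and a dummy graph node is built that depends on the weight updates. Asking the graph to compute the dummy node forces the updates to run, while the node itself produces no output, so no weight values have to be returned on each step.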
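The steps above can be sketched as follows. This is a hedged example that assumes TensorFlow 2.x with the graph-mode `tf.compat.v1` API; the sizes, learning rate, and the two-layer network itself are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # assumption: use TF1-style static graphs

N, D, H = 64, 100, 50  # illustrative sizes
x = tf1.placeholder(tf.float32, shape=(N, D))
y = tf1.placeholder(tf.float32, shape=(N, D))

# Weights are Variables, so they persist in the graph between calls
w1 = tf.Variable(tf1.random_normal((D, H)))
w2 = tf.Variable(tf1.random_normal((H, D)))

h = tf.maximum(tf.matmul(x, w1), 0)
y_pred = tf.matmul(h, w2)
loss = tf.reduce_mean(tf.square(y_pred - y))

# Assign operations make the weight update part of the graph
grad_w1, grad_w2 = tf1.gradients(loss, [w1, w2])
learning_rate = 1e-4
new_w1 = w1.assign(w1 - learning_rate * grad_w1)
new_w2 = w2.assign(w2 - learning_rate * grad_w2)

# Dummy node that depends on both updates and returns no value
updates = tf.group(new_w1, new_w2)

losses = []
with tf1.Session() as sess:
    # Run the graph once to initialize w1 and w2
    sess.run(tf1.global_variables_initializer())
    values = {x: np.random.randn(N, D), y: np.random.randn(N, D)}
    for step in range(50):
        # Fetching the dummy node triggers the updates
        loss_val, _ = sess.run([loss, updates], feed_dict=values)
        losses.append(loss_val)
```

Only the scalar loss is fetched on each step; the weights never leave the graph.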

4. PyTorch

Differences between PyTorch and TensorFlow:

Building a network with PyTorch:

How to run PyTorch code on the GPU:
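A minimal sketch, assuming a recent PyTorch release in which tensors are placed with a `device` argument. It falls back to the CPU when no CUDA device is available; the network and its sizes are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

N, D, H = 64, 100, 50  # illustrative sizes
x = torch.randn(N, D, device=device)
y = torch.randn(N, D, device=device)
w1 = torch.randn(D, H, device=device) * 0.1
w2 = torch.randn(H, D, device=device) * 0.1

learning_rate = 1e-4
for step in range(50):
    # Forward pass; all tensors live on `device`, so the math runs there
    h = (x @ w1).clamp(min=0)
    y_pred = h @ w2
    loss = (y_pred - y).pow(2).sum()

    # Manual backward pass, same structure as a NumPy version
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h.t() @ grad_y_pred
    grad_h = grad_y_pred @ w2.t()
    grad_h[h <= 0] = 0
    grad_w1 = x.t() @ grad_h

    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

final_loss = loss.item()  # .item() copies the scalar back to the host
```

Creating the tensors directly on the device avoids an extra host-to-device copy; an existing CPU tensor can be moved with `.to(device)` instead.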

PyTorch can compute gradients automatically (x and y do not need gradients):
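As a hedged illustration, the sketch below uses the modern `requires_grad` flag on tensors (older PyTorch versions wrapped tensors in `Variable` instead). Only the weights request gradients; the data tensors `x` and `y` do not. Sizes and learning rate are illustrative.

```python
import torch

torch.manual_seed(0)
N, D, H = 64, 100, 50  # illustrative sizes
x = torch.randn(N, D)  # data: no gradient needed
y = torch.randn(N, D)  # targets: no gradient needed
w1 = (torch.randn(D, H) * 0.1).requires_grad_()
w2 = (torch.randn(H, D) * 0.1).requires_grad_()

learning_rate = 1e-4
losses = []
for step in range(50):
    y_pred = (x @ w1).clamp(min=0) @ w2
    loss = (y_pred - y).pow(2).sum()
    losses.append(loss.item())

    # Autograd fills in w1.grad and w2.grad
    loss.backward()

    # Update outside of autograd tracking, then reset the gradients
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()
```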

PyTorch also lets you define your own gradient computation:
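One way to do this is to subclass `torch.autograd.Function` and supply both the forward and the backward computation. The sketch below assumes the static-method style used by recent PyTorch versions; `MyReLU` is a hypothetical name chosen for illustration.

```python
import torch

class MyReLU(torch.autograd.Function):
    """ReLU with a hand-written backward pass (illustrative)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # keep the input for the backward pass
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[x < 0] = 0  # no gradient flows where the input was negative
        return grad_input

x = torch.tensor([-2.0, -0.5, 1.0, 3.0], requires_grad=True)
out = MyReLU.apply(x)
out.sum().backward()
```

After the call to `backward()`, `x.grad` holds the gradient produced by the custom backward method.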

PyTorch's nn package:
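A short sketch of the higher-level API, assuming `torch.nn.Sequential` for the model, `torch.nn.MSELoss` for the loss, and `torch.optim.SGD` as the optimizer. The layer sizes and learning rate are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
N, D, H = 64, 100, 50  # illustrative sizes
x = torch.randn(N, D)
y = torch.randn(N, D)

# Layers own their weights, so none are created by hand
model = torch.nn.Sequential(
    torch.nn.Linear(D, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D),
)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(50):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    losses.append(loss.item())

    optimizer.zero_grad()  # clear gradients from the previous step
    loss.backward()        # compute gradients for all parameters
    optimizer.step()       # apply the update rule
```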

Splitting training data into minibatches with PyTorch:
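A minimal sketch, assuming `torch.utils.data.DataLoader` wrapped around a `TensorDataset`. The data, the linear model, and the batch size are made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
N, D = 64, 10  # illustrative sizes
x = torch.randn(N, D)
y = torch.randn(N, 1)

# The loader handles shuffling and slicing the dataset into minibatches
loader = DataLoader(TensorDataset(x, y), batch_size=8, shuffle=True)
model = torch.nn.Linear(D, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

batch_sizes = []
for epoch in range(2):
    for x_batch, y_batch in loader:
        batch_sizes.append(x_batch.shape[0])
        loss = torch.nn.functional.mse_loss(model(x_batch), y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

With 64 samples and a batch size of 8, each epoch yields 8 minibatches.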

PyTorch's library of pretrained models:

Static vs Dynamic Graphs:

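Static graphs are easier to optimize than dynamic graphs:

Because a static graph's architecture is fixed before training begins, the graph only has to be built once. After that it can run without the code that built it, for example as C code, which is more efficient.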

With conditional branches, the dynamic-graph version is more concise:
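A hedged sketch of the dynamic case in PyTorch: the branch is an ordinary Python `if`, and the graph is built from whichever path actually runs. The tensor names, sizes, and the value of `z` are illustrative.

```python
import torch

N, D, H = 3, 4, 5  # illustrative sizes
x = torch.randn(N, D)
w1 = torch.randn(D, H, requires_grad=True)
w2 = torch.randn(D, H, requires_grad=True)
z = 10  # ordinary Python value that decides the branch

# Plain Python control flow; no special graph operator is needed
if z > 0:
    y = x @ w1
else:
    y = x @ w2

y.sum().backward()
```

Only the weight on the branch that was taken receives a gradient. A static graph would need a dedicated conditional operator to express both branches up front.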

With loops (recurrent networks), the dynamic-graph version is more concise:
(TensorFlow Fold makes dynamic graphs easier in TensorFlow through dynamic batching)
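For the loop case, a dynamic framework can use a plain Python `for` loop. The sketch below assumes the simple recurrence y_t = (y_{t-1} + x_t) * w, chosen only for illustration; names and sizes are hypothetical.

```python
import torch

T, D = 3, 4  # sequence length and feature size (illustrative)
y0 = torch.randn(D)
x = torch.randn(T, D)
w = torch.randn(D, requires_grad=True)

# The graph grows by one step per loop iteration
y = [y0]
for t in range(T):
    y.append((y[-1] + x[t]) * w)

# Backpropagate through every step of the unrolled loop
y[-1].sum().backward()
```

The sequence length can differ from one input to the next without any change to the code.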

Dynamic Graph Applications:

5. Caffe

Features:

● Core written in C++
● Has Python and MATLAB bindings
● Good for training or finetuning feedforward classification models
● Often no need to write code!
● Not used as much in research anymore, still popular for deploying models

Usage steps:

1. Convert data (run a script)

●  DataLayer reading from LMDB is the easiest
●  Create LMDB using convert_imageset
●  Need text file where each line is
    ○ “[path/to/image.jpeg] [label]”
●  Create HDF5 file yourself using h5py
●  ImageDataLayer: Read from image files
●  WindowDataLayer: For detection
●  HDF5Layer: Read from HDF5 file
●  From memory, using Python interface
●  All of these are harder to use (except Python)
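The list file mentioned above, with one "[path/to/image.jpeg] [label]" entry per line, can be generated with a few lines of Python. This is a sketch under stated assumptions: `write_image_list`, the sample paths, and the labels are hypothetical names and values chosen for illustration.

```python
import os
import tempfile

def write_image_list(samples, list_path):
    """Write one "path label" line per (image_path, label) pair."""
    with open(list_path, "w") as f:
        for image_path, label in samples:
            f.write(f"{image_path} {label}\n")

# Hypothetical image paths and integer class labels
samples = [("images/cat_001.jpeg", 0), ("images/dog_001.jpeg", 1)]

list_path = os.path.join(tempfile.mkdtemp(), "train.txt")
write_image_list(samples, list_path)

# Read the file back to inspect its contents
with open(list_path) as f:
    lines = f.read().splitlines()
```

The resulting text file is what would then be handed to convert_imageset to build the LMDB.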

2. Define net (edit prototxt)

3. Define solver (edit prototxt)

4. Train (with pretrained weights) (run a script)

Caffe resources:

Caffe pros and cons:

● Interfacing with numpy
● Extract features: Run net forward
● Compute gradients: Run net backward (DeepDream, etc)
● Define layers in Python with numpy (CPU only)
● (+) Good for feedforward networks
● (+) Good for finetuning existing networks
● (+) Train models without writing any code!
● (+) Python interface is pretty useful!
● (+) Can deploy without Python
● (-) Need to write C++ / CUDA for new GPU layers
● (-) Not good for recurrent networks
● (-) Cumbersome for big networks (GoogLeNet, ResNet)

Caffe2 Overview
● Static graphs, somewhat similar to TensorFlow
● Core written in C++
● Nice Python interface
● Can train model in Python, then serialize and deploy without Python
● Works on iOS / Android, etc

Advice on which framework to use: